ordinal variable
Bounce: Reliable High-Dimensional Bayesian Optimization for Combinatorial and Mixed Spaces
Impactful applications such as materials discovery, hardware design, neural architecture search, or portfolio optimization require optimizing high-dimensional black-box functions with mixed and combinatorial input spaces. While Bayesian optimization has recently made significant progress in solving such problems, an in-depth analysis reveals that the current state-of-the-art methods are not reliable. Their performance degrades substantially when the unknown optima of the function do not have a certain structure. To fill the need for a reliable algorithm for combinatorial and mixed spaces, this paper proposes Bounce, which relies on a novel map of various variable types into nested embeddings of increasing dimensionality. Comprehensive experiments show that Bounce reliably achieves and often even improves upon state-of-the-art performance on a variety of high-dimensional problems.
Omnipresent Yet Overlooked: Heat Kernels in Combinatorial Bayesian Optimization
Doumont, Colin, Picheny, Victor, Borovitskiy, Viacheslav, Moss, Henry
Bayesian Optimization (BO) has the potential to solve various combinatorial tasks, ranging from materials science to neural architecture search. However, BO requires specialized kernels to effectively model combinatorial domains. Recent efforts have introduced several combinatorial kernels, but the relationships among them are not well understood. To bridge this gap, we develop a unifying framework based on heat kernels, which we derive in a systematic way and express as simple closed-form expressions. Using this framework, we prove that many successful combinatorial kernels are either related or equivalent to heat kernels, and validate this theoretical claim in our experiments. Moreover, our analysis confirms and extends the results presented in Bounce: certain algorithms' performance decreases substantially when the unknown optima of the function do not have a certain structure. In contrast, heat kernels are not sensitive to the location of the optima. Lastly, we show that a fast and simple pipeline, relying on heat kernels, is able to achieve state-of-the-art results, matching or even outperforming certain slow or complex algorithms.
A Latent Causal Inference Framework for Ordinal Variables
Scauda, Martina, Kuipers, Jack, Moffa, Giusi
Ordinal variables, such as on the Likert scale, are common in applied research. Yet, existing methods for causal inference tend to target nominal or continuous data. When applied to ordinal data, this fails to account for the inherent ordering or imposes well-defined relative magnitudes. Hence, there is a need for specialised methods to compute interventional effects between ordinal variables while accounting for their ordinality. One potential framework is to presume a latent Gaussian Directed Acyclic Graph (DAG) model: that the ordinal variables originate from marginally discretizing a set of Gaussian variables whose latent covariance matrix is constrained to satisfy the conditional independencies inherent in a DAG. Conditioned on a given latent covariance matrix and discretisation thresholds, we derive a closed-form function for ordinal causal effects in terms of interventional distributions in the latent space. Our causal estimation combines naturally with algorithms to learn the latent DAG and its parameters, like the Ordinal Structural EM algorithm. Simulations demonstrate the applicability of the proposed approach in estimating ordinal causal effects both for known and unknown structures of the latent graph. As an illustration of a real-world use case, the method is applied to survey data of 408 patients from a study on the functional relationships between symptoms of obsessive-compulsive disorder and depression.
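The marginal discretisation described in this abstract can be illustrated with a small sketch. It is not the paper's closed-form causal effect: it only shows how thresholds turn one latent Gaussian variable into ordinal level probabilities, and how those probabilities move when the latent mean shifts. The function names, the thresholds, and the size of the shift are all hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordinal_level_probs(thresholds, mean=0.0, sd=1.0):
    """P(X = k) when X is obtained by cutting a N(mean, sd^2) latent
    variable at the given thresholds (K - 1 thresholds give K levels)."""
    cuts = [-math.inf] + sorted(thresholds) + [math.inf]
    cdf = [normal_cdf((c - mean) / sd) for c in cuts]
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]

# Hypothetical thresholds for a 3-level ordinal variable.
thresholds = [-0.5, 0.8]
baseline = ordinal_level_probs(thresholds)

# Assumed latent mean shift of 1.0, used here only as an illustration of
# how a change in the latent space moves mass towards higher levels.
shifted = ordinal_level_probs(thresholds, mean=1.0)
```

Comparing `baseline` with `shifted` level by level gives a distribution-valued effect on the ordinal scale, without assigning numeric magnitudes to the levels themselves.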
Asymptotically Exact and Fast Gaussian Copula Models for Imputation of Mixed Data Types
Christoffersen, Benjamin, Clements, Mark, Humphreys, Keith, Kjellström, Hedvig
Missing values in data of mixed types are a common problem in a large number of machine learning applications, such as the processing of surveys and various medical applications. Recently, Gaussian copula models have been suggested as a means of performing imputation of missing values using a probabilistic framework. While the present Gaussian copula models have been shown to yield state-of-the-art performance, they have two limitations: they are based on an approximation that is fast but may be imprecise, and they do not support unordered multinomial variables. We address the first limitation using direct and arbitrarily precise approximations both for model estimation and imputation by using randomized quasi-Monte Carlo procedures. The method we provide has lower errors for the estimated model parameters and the imputed values, compared to previously proposed methods. We also extend the previous Gaussian copula models to include unordered multinomial variables in addition to the present support of ordinal, binary, and continuous variables.
Learning a binary search with a recurrent neural network. A novel approach to ordinal regression analysis
Falissard, Louis, Bounebache, Karim, Rey, Grégoire
Deep neural networks are a family of computational models that are naturally suited to the analysis of hierarchical data such as, for instance, sequential data with the use of recurrent neural networks. On the other hand, ordinal regression is a well-known predictive modelling problem used in fields as diverse as psychometry and deep neural network based voice modelling. Its specificity lies in the properties of the outcome variable, typically considered as a categorical variable with a natural ordering that allows comparisons between different states ("a little" is less than "somewhat", which is itself less than "a lot", with transitivity allowed). This article investigates the application of sequence-to-sequence learning methods provided by the deep learning framework in ordinal regression, by formulating the ordinal regression problem as a sequential binary search. A method for visualizing the model's explanatory variables according to the ordinal target variable is proposed, which bears some similarities to linear discriminant analysis. The method is compared to traditional ordinal regression methods on a number of benchmark datasets, and is shown to have comparable or significantly better predictive power.
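The idea of casting an ordinal label as a sequential binary search can be sketched as follows. This shows only the label encoding, with no recurrent network, and is an assumption about how such a search could be laid out rather than the authors' implementation. The helper names are hypothetical.

```python
def label_to_decisions(label, num_levels):
    """Encode an ordinal label in 0..num_levels-1 as the sequence of
    left (0) / right (1) decisions a binary search makes to reach it."""
    lo, hi = 0, num_levels - 1
    decisions = []
    while lo < hi:
        mid = (lo + hi) // 2
        go_right = label > mid
        decisions.append(int(go_right))
        if go_right:
            lo = mid + 1
        else:
            hi = mid
    return decisions

def decisions_to_label(decisions, num_levels):
    """Decode a decision sequence back to the ordinal label it selects."""
    lo, hi = 0, num_levels - 1
    for go_right in decisions:
        if lo >= hi:
            break
        mid = (lo + hi) // 2
        if go_right:
            lo = mid + 1
        else:
            hi = mid
    return lo

# Example with four ordered levels, e.g. "none" < "a little" < "somewhat" < "a lot".
encoded = [label_to_decisions(k, 4) for k in range(4)]
```

Under this layout a sequence model would predict one binary decision per step, so a label among K levels needs about log2(K) steps.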
Guide to Encoding Categorical Features Using Scikit-Learn For Machine Learning
One of the most crucial preprocessing steps in any machine learning project is feature encoding. It is the process of turning categorical data in a dataset into numerical data. It is essential that we perform feature encoding because most machine learning models can only interpret numerical data and not data in text form. As usual, I will demonstrate these concepts through a practical case study using the students' performance in exams dataset on Kaggle. You can find the complete notebook up on my GitHub here.
Beyond One-Hot: an exploration of categorical variables
In machine learning, data are king. The algorithms and models used to make predictions with the data are important, and very interesting, but ML is still subject to the idea of garbage-in-garbage-out. With that in mind, let's look at a little subset of those input data: categorical variables. Categorical variables are those that represent a fixed number of possible values, rather than a continuous number. Each value assigns the measurement to one of those finite groups, or categories. They differ from ordinal variables in that the distance from one category to another ought to be equal regardless of the number of categories, as opposed to ordinal variables which have some intrinsic ordering.
Ordinal regression - Wikipedia, the free encyclopedia
In statistics, ordinal regression (also called "ordinal classification") is a type of regression analysis used for predicting an ordinal variable, i.e. a variable whose value exists on an arbitrary scale where only the relative ordering between different values is significant. It can be considered an intermediate problem in between (metric) regression and classification.[1] Ordinal regression turns up often in the social sciences, for example in the modeling of human levels of preference (on a scale from, say, 1–5 for "very poor" through "excellent"), as well as in information retrieval. In machine learning, ordinal regression may also be called ranking learning.[2][a] Ordinal regression can be performed using a generalized linear model (GLM) that fits both a coefficient vector and a set of thresholds to a dataset.
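A minimal sketch of how a coefficient vector and a set of thresholds produce class probabilities, assuming a logit link (the proportional odds form, one common choice of GLM link): P(y <= k | x) = sigmoid(theta_k - w·x). The function names and all numeric values are hypothetical, and no fitting is shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ordinal_probs(x, w, thresholds):
    """Class probabilities of a cumulative logit model.

    thresholds must be increasing; K - 1 thresholds give K ordered classes.
    """
    score = sum(wi * xi for wi, xi in zip(w, x))
    # Cumulative probabilities P(y <= k); the last class closes at 1.
    cum = [sigmoid(t - score) for t in thresholds] + [1.0]
    # Differences of consecutive cumulative values give per-class mass.
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# Hypothetical coefficients and thresholds for a 5-point rating scale.
w = [0.8, -0.3]
thresholds = [-2.0, -0.5, 0.5, 2.0]
probs = ordinal_probs([1.0, 2.0], w, thresholds)
```

Because the same score is compared against every threshold, raising the score moves probability mass towards higher classes while preserving their order.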
Metric Learning for Ordinal Data
Shi, Yuan (University of Southern California) | Li, Wenzhe (University of Southern California) | Sha, Fei (University of Southern California)
A large amount of ordinal-valued data exists in many domains, including medical and health science, social science, economics, political science, etc. Unlike image and speech datasets of real-valued data, learning with ordinal variables (i.e., features) presents unique challenges. In particular, the nominal differences between those feature values, which are just ranks, do not necessarily correspond to the real distances between the corresponding categories. Given their wide existence, it is imperative to develop machine learning algorithms that specifically address the need to model and infer with such data. In this paper, we present a novel metric learning algorithm that takes into consideration the nature of ordinal data. Our approach treats ordinal values as latent variables in intervals. Our algorithm then learns what those intervals are as well as distance metrics to measure distances between latent variables in those intervals. We derive the corresponding optimization algorithm and demonstrate how it can be solved effectively. Experimental results show that the proposed approach improves significantly over baselines that do not explicitly model ordinal features.
Cumulative Restricted Boltzmann Machines for Ordinal Matrix Data Analysis
Tran, Truyen, Phung, Dinh, Venkatesh, Svetha
Restricted Boltzmann machines (RBMs) [36, 9, 20] have recently attracted significant interest due to their versatility in a variety of unsupervised and supervised learning tasks [35, 18, 25], and in building deep architectures [14, 31]. An RBM is a bipartite undirected model that captures the generative process in which a data vector is generated from a binary hidden vector. The bipartite architecture enables very fast data encoding and sampling-based inference; and together with recent advances in learning procedures, we can now process massive data with large models [13, 37, 2]. This paper presents our contributions in developing RBM specifications as well as learning and inference procedures for multivariate ordinal data. This extends and consolidates the reach of RBMs to a wide range of user-generated domains: social responses, recommender systems, product/paper reviews, and expert assessments of health and ecosystem indicators.